An illustration of the risk of borrowing information via a shared likelihood
A concrete, stylized example illustrates that inferences may be degraded,
rather than improved, by incorporating supplementary data via a joint
likelihood. In the example, the likelihood is assumed to be correctly
specified, as is the prior over the parameter of interest; all that is
necessary for the joint modeling approach to suffer is misspecification of the
prior over a nuisance parameter.
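The phenomenon described above can be sketched numerically. Suppose primary data y_i ~ N(θ, 1) and supplementary data z_j ~ N(θ + b, 1), where b is a nuisance bias. If the analyst's nuisance prior wrongly fixes b = 0 while in truth b ≠ 0, the joint analysis pulls the estimate of θ toward θ + b. All names and numbers below are illustrative assumptions, not taken from the paper:

```python
import random

random.seed(0)
theta, b = 1.0, 2.0          # true parameter and true nuisance bias
n = m = 50

# Primary data informs theta directly; supplementary data carries bias b.
y = [random.gauss(theta, 1.0) for _ in range(n)]
z = [random.gauss(theta + b, 1.0) for _ in range(m)]

ybar = sum(y) / n
zbar = sum(z) / m

# Conditional analysis: use y alone (flat prior on theta).
cond_est = ybar

# Joint analysis under a misspecified nuisance prior fixing b = 0:
# both samples are then treated as draws from N(theta, 1), so the
# estimate is pulled toward theta + b * m / (n + m).
joint_est = (n * ybar + m * zbar) / (n + m)
```

With these settings the conditional estimate lands near θ = 1, while the joint estimate is biased toward 2: borrowing information through the shared likelihood has made inference worse, exactly as the abstract warns.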
A Structural Approach to Coordinate-Free Statistics
We consider the question of learning in general topological vector spaces. By
exploiting known (or parametrized) covariance structures, our Main Theorem
demonstrates that any continuous linear map corresponds to a certain
isomorphism of embedded Hilbert spaces. By inverting this isomorphism and
extending continuously, we construct a version of the Ordinary Least Squares
estimator in absolute generality. Our Gauss-Markov theorem demonstrates that
OLS is a "best linear unbiased estimator", extending the classical result. We
construct a stochastic version of the OLS estimator, which is a continuous
disintegration exactly for the class of "uncorrelated implies independent"
(UII) measures. As a consequence, Gaussian measures always exhibit continuous
disintegrations through continuous linear maps, extending a theorem of the
first author. Applying this framework to some problems in machine learning, we
prove a useful representation theorem for covariance tensors, and show that OLS
defines a good kriging predictor for vector-valued arrays on general index
spaces. We also construct a support-vector machine classifier in this setting.
We hope that our article shines light on some deeper connections between
probability theory, statistics and machine learning, and may serve as a point
of intersection for these three communities.
Comment: 31 pages
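The paper works in general topological vector spaces, but in the familiar special case of R^n the construction reduces to ordinary least squares via the normal equations. As a minimal finite-dimensional sketch (not the paper's coordinate-free machinery), the one-predictor case collapses to the textbook slope/intercept formulas:

```python
def ols_simple(x, y):
    """Closed-form OLS for simple linear regression y = a + b*x.

    In the one-predictor special case, the normal equations collapse
    to the textbook slope and intercept formulas.
    """
    n = len(x)
    xbar = sum(x) / n
    ybar = sum(y) / n
    sxx = sum((xi - xbar) ** 2 for xi in x)
    sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = ybar - slope * xbar
    return intercept, slope

a, b = ols_simple([0, 1, 2, 3], [1, 3, 5, 7])  # exact line y = 1 + 2x
```

The Gauss-Markov theorem in the abstract generalizes exactly this estimator's "best linear unbiased" property from R^n to embedded Hilbert spaces.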
Shrinkage priors for linear instrumental variable models with many instruments
This paper addresses the weak instruments problem in linear instrumental
variable models from a Bayesian perspective. The new approach has two
components. First, a novel predictor-dependent shrinkage prior is developed for
the many instruments setting. The prior is constructed based on a factor model
decomposition of the matrix of observed instruments, allowing many instruments
to be incorporated into the analysis in a robust way.
Second, the new prior is implemented via an importance sampling scheme, which
utilizes posterior Monte Carlo samples from a first-stage Bayesian regression
analysis. This modular computation makes sensitivity analyses straightforward.
Two simulation studies are provided to demonstrate the advantages of the new
method. As an empirical illustration, the new method is used to estimate a key
parameter in macroeconomic models: the elasticity of intertemporal
substitution. The empirical analysis produces substantive conclusions in line
with previous studies, but certain inconsistencies of earlier analyses are
resolved.
Comment: 27 pages, 6 figures, 3 tables
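The modular computation described above rests on a standard identity: posterior draws obtained under one prior can be reweighted to the posterior under another prior, because the likelihood is shared and cancels, leaving importance weights proportional to the ratio of the two priors. A toy sketch under purely illustrative assumptions (a Gaussian cloud standing in for first-stage posterior draws, a Laplace target prior):

```python
import math
import random

random.seed(1)

# Stand-in first-stage posterior draws for a coefficient beta,
# obtained under a vague prior p0 = N(0, 10^2).
draws = [random.gauss(1.0, 0.5) for _ in range(5000)]

def log_p0(b):
    # Vague Gaussian prior, up to an additive constant.
    return -0.5 * (b / 10.0) ** 2

def log_p1(b):
    # Target shrinkage prior Laplace(0, 0.5), up to a constant.
    return -abs(b) / 0.5

# Shared likelihood cancels: weights are proportional to p1 / p0.
logw = [log_p1(b) - log_p0(b) for b in draws]
mx = max(logw)                                # stabilize before exponentiating
w = [math.exp(lw - mx) for lw in logw]
wsum = sum(w)

plain_mean = sum(draws) / len(draws)
shrunk_mean = sum(wi * bi for wi, bi in zip(w, draws)) / wsum
```

Reweighting toward the heavier-shrinkage prior pulls the posterior mean toward zero without rerunning the first-stage sampler, which is what makes sensitivity analysis across priors cheap.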
Decoupling shrinkage and selection in Bayesian linear models: a posterior summary perspective
Selecting a subset of variables for linear models remains an active area of
research. This paper reviews many of the recent contributions to the Bayesian
model selection and shrinkage prior literature. A posterior variable selection
summary is proposed, which distills a full posterior distribution over
regression coefficients into a sequence of sparse linear predictors.
Comment: 30 pages, 6 figures, 2 tables
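The paper's summary is obtained by solving a penalized optimization against posterior predictions; as a much cruder stand-in that conveys the shape of the output, one can order posterior-mean coefficients by magnitude and emit a nested sequence of increasingly dense sparse predictors. This sketch is illustrative only, not the paper's procedure:

```python
def sparse_sequence(beta_bar):
    """Distill a posterior-mean coefficient vector into a nested sequence
    of sparse linear predictors by retaining the k largest-magnitude
    coefficients for k = 1, ..., p (a crude stand-in for the paper's
    penalized posterior summary).
    """
    order = sorted(range(len(beta_bar)), key=lambda j: -abs(beta_bar[j]))
    seq = []
    for k in range(1, len(beta_bar) + 1):
        keep = set(order[:k])
        seq.append([b if j in keep else 0.0 for j, b in enumerate(beta_bar)])
    return seq

seq = sparse_sequence([0.1, -2.0, 0.0, 0.7])
```

The point of the decoupling is that shrinkage happens once, in the posterior, and the sparsity level is chosen afterwards as a reporting decision rather than a modeling one.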
Predictor-dependent shrinkage for linear regression via partial factor modeling
In prediction problems with more predictors than observations, it can
sometimes be helpful to use a joint probability model, p(y, x), rather than
a purely conditional model, p(y | x), where y is a scalar response
variable and x is a vector of predictors. This approach is motivated by the
fact that in many situations the marginal predictor distribution p(x) can
provide useful information about the parameter values governing the conditional
regression. However, under very mild misspecification, this marginal
distribution can also lead conditional inferences astray. Here, we explore
these ideas in the context of linear factor models, to understand how they play
out in a familiar setting. The resulting Bayesian model performs well across a
wide range of covariance structures, on real and simulated data.
Comment: 16 pages, 1 figure, 2 tables
Regret-based Selection for Sparse Dynamic Portfolios
This paper considers portfolio construction in a dynamic setting. We specify
a loss function comprised of utility and complexity components with an unknown
tradeoff parameter. We develop a novel regret-based criterion for selecting the
tradeoff parameter to construct optimal sparse portfolios over time.
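One stylized reading of a regret-based criterion: among candidate portfolios of varying sparsity, select the sparsest one whose expected utility falls within a regret tolerance of the best available candidate. The sketch below is a hypothetical illustration of that selection rule, not the paper's dynamic procedure, and all numbers are invented:

```python
def select_by_regret(candidates, tol):
    """candidates: list of (num_assets, expected_utility) pairs.

    Return the sparsest portfolio whose utility regret relative to the
    best candidate is at most tol -- a stylized way of resolving the
    utility/complexity tradeoff by bounding regret instead of fixing
    the tradeoff parameter directly.
    """
    best_u = max(u for _, u in candidates)
    feasible = [(k, u) for k, u in candidates if best_u - u <= tol]
    return min(feasible)  # fewest assets; ties broken by lower utility

chosen = select_by_regret([(30, 1.00), (10, 0.97), (3, 0.80)], tol=0.05)
```

Here a 10-asset portfolio sacrifices only 0.03 utility relative to the dense optimum and is therefore preferred, while the 3-asset portfolio's regret of 0.20 exceeds the tolerance.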
Efficient sampling for Gaussian linear regression with arbitrary priors
This paper develops a slice sampler for Bayesian linear regression models
with arbitrary priors. The new sampler has two advantages over current
approaches. One, it is faster than many custom implementations that rely on
auxiliary latent variables, if the number of regressors is large. Two, it can
be used with any prior with a density function that can be evaluated up to a
normalizing constant, making it ideal for investigating the properties of new
shrinkage priors without having to develop custom sampling algorithms. The new
sampler takes advantage of the special structure of the linear regression
likelihood, allowing it to produce better effective sample size per second than
common alternative approaches.
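The paper's sampler exploits the special structure of the regression likelihood, but its key selling point, needing only a density evaluable up to a normalizing constant, is shared with generic slice sampling. As a self-contained illustration of that requirement, here is a standard univariate stepping-out slice sampler applied to a toy posterior (normal likelihood with a Laplace prior); the target and all settings are assumptions for the demo, not the paper's algorithm:

```python
import random

def slice_sample(logpost, x0, n_draws, w=1.0, seed=0):
    """Univariate stepping-out slice sampler.

    Requires only that logpost can be evaluated up to an additive
    constant -- the same minimal requirement the abstract highlights.
    """
    rng = random.Random(seed)
    x, draws = x0, []
    for _ in range(n_draws):
        logy = logpost(x) - rng.expovariate(1.0)  # slice height under the curve
        left = x - rng.uniform(0.0, w)            # step out an interval
        right = left + w
        while logpost(left) > logy:
            left -= w
        while logpost(right) > logy:
            right += w
        while True:                               # shrink until a point lands on the slice
            x_new = rng.uniform(left, right)
            if logpost(x_new) > logy:
                x = x_new
                break
            if x_new < x:
                left = x_new
            else:
                right = x_new
        draws.append(x)
    return draws

# Toy target: N(ybar, 1/n) likelihood for theta times a Laplace(0, 1)
# prior, known only up to a normalizing constant.
n, ybar = 20, 2.0
logpost = lambda t: -0.5 * n * (t - ybar) ** 2 - abs(t)
draws = slice_sample(logpost, x0=0.0, n_draws=2000)
post_mean = sum(draws) / len(draws)
```

Swapping in a different shrinkage prior only changes the `logpost` function, which is precisely why this style of sampler is convenient for prior exploration.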
XBART: Accelerated Bayesian Additive Regression Trees
Bayesian additive regression trees (BART) (Chipman et al., 2010) is a
powerful predictive model that often outperforms alternative models at
out-of-sample prediction. BART is especially well-suited to settings with
unstructured predictor variables and substantial sources of unmeasured
variation as is typical in the social, behavioral and health sciences. This
paper develops a modified version of BART that is amenable to fast posterior
estimation. We present a stochastic hill climbing algorithm that matches the
remarkable predictive accuracy of previous BART implementations, but is many
times faster and less memory intensive. Simulation studies show that the new
method is comparable in computation time and more accurate at function
estimation than both random forests and gradient boosting.
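The elementary operation that tree methods like (X)BART repeat recursively is a split search. As a much-simplified illustration of that single ingredient, not of BART's Bayesian criterion or XBART's stochastic hill climbing, here is an exhaustive search for the one split of a predictor that minimizes within-leaf squared error:

```python
def best_split(x, y):
    """Find the single cut point on predictor x minimizing the total
    within-leaf sum of squared errors of y -- the basic split-search
    step that tree ensembles apply recursively."""
    def sse(vals):
        if not vals:
            return 0.0
        m = sum(vals) / len(vals)
        return sum((v - m) ** 2 for v in vals)

    xs = sorted(set(x))
    best_cut, best_loss = None, float("inf")
    for a, b in zip(xs, xs[1:]):
        cut = (a + b) / 2.0                       # midpoint between adjacent values
        left = [yi for xi, yi in zip(x, y) if xi <= cut]
        right = [yi for xi, yi in zip(x, y) if xi > cut]
        loss = sse(left) + sse(right)
        if loss < best_loss:
            best_cut, best_loss = cut, loss
    return best_cut

cut = best_split([1, 2, 3, 4, 5, 6, 7, 8], [0, 0, 0, 0, 1, 1, 1, 1])
```

On this step-function example the search recovers the true breakpoint at 4.5; XBART's speedups come from organizing exactly this kind of computation around pre-sorted predictors and approximate stochastic search.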
Variable Selection in Seemingly Unrelated Regressions with Random Predictors
This paper considers linear model selection when the response is
vector-valued and the predictors are randomly observed. We propose a new
approach that decouples statistical inference from the selection step in a
"post-inference model summarization" strategy. We study the impact of predictor
uncertainty on the model selection procedure. The method is demonstrated
through an application to asset pricing.
A Bayesian hierarchical model for inferring player strategy types in a number guessing game
This paper presents an in-depth statistical analysis of an experiment
designed to measure the extent to which players in a simple game behave
according to a popular behavioral economic model. The p-beauty contest is a
multi-player number guessing game that has been widely used to study strategic
behavior. This paper describes beauty contest experiments for an audience of
data analysts, with a special focus on a class of models for game play called
k-step thinking models, which allow each player in the game to employ an
idiosyncratic strategy. We fit a Bayesian statistical model to estimate the
proportion of our player population whose game play is compatible with a k-step
thinking model. Our findings put this number at approximately 25%.
Comment: 46 pages, 14 figures, 2 tables
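The k-step structure is easy to compute directly. In a p-beauty contest on [0, 100] with target fraction p (commonly 2/3), a level-0 player guesses the anchor (e.g. the midpoint, 50), and a level-k player best-responds to a population of level-(k-1) players, giving guesses of p^k times the anchor. A minimal sketch, with the anchor value an illustrative convention rather than a claim about this paper's experiment:

```python
def k_step_guess(k, p=2/3, anchor=50.0):
    """Guess of a level-k thinker in a p-beauty contest on [0, 100]:
    level 0 guesses the anchor, and level k best-responds to a
    population of level-(k-1) players by guessing p times their guess,
    i.e. p**k * anchor."""
    guess = anchor
    for _ in range(k):
        guess *= p
    return guess

ladder = [round(k_step_guess(k), 2) for k in range(4)]
# -> [50.0, 33.33, 22.22, 14.81]
```

Each player's observed guess can then be matched against this ladder, which is the sense in which game play is "compatible with" a k-step thinking model.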